Sequential Application of Multivariate Outlier Test: A Robust Approach
Author
Abstract
Identification of outliers in multivariate data is not trivial, especially when the data contain several outliers. The classical identification method based on the sample mean and sample covariance matrix cannot always find them, because the classical mean and covariance matrix are themselves affected by outliers. This problem is termed masking, because the outliers are masked by each other. To avoid the masking effect, robust estimates of the mean and covariance have been suggested by many authors. This thesis deals with the problem of identifying and testing a set of k extreme sample points as significant outliers in a sample of size n drawn from a p-dimensional normal distribution with unknown parameters. A robust sequential procedure is suggested for the identification of multiple outliers in multivariate normal data. Chapter 1 gives a brief introduction to outliers in statistical data. In Chapter 2 we review the rejection techniques for single and multiple outliers suggested by different authors. We also review some recent papers which deal with the problem of accommodation and identification of multiple outliers using robust procedures. In Chapter 3 we propose a test criterion for the detection of multivariate outliers based on robust estimates of the mean and covariance. Chapter 4 deals with the application of our proposed test procedure to different examples. A simulation study is carried out to support the good behaviour of the proposed sequential test when the data are multivariate normal. We also study the performance of the proposed sequential test in the presence of outliers.
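The masking effect described in the abstract can be illustrated with a small sketch. It is not the sequential test proposed in the thesis. It only contrasts squared Mahalanobis distances computed from the classical mean and covariance with distances computed from a crude robust stand-in. The toy bivariate data, the half-sample estimator built around the coordinate-wise median, the 0.99 chi-square cutoff, and all function names are assumptions made for illustration.

```python
import math
from statistics import median


def mean_and_cov(points):
    """Sample mean and (n-1)-normalised covariance of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def sq_mahalanobis(point, center, cov):
    """Squared Mahalanobis distance via the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


def crude_robust_estimates(points):
    """Assumed stand-in for a robust estimator: mean and covariance of the
    half-sample closest (Euclidean) to the coordinate-wise median."""
    med = (median(x for x, _ in points), median(y for _, y in points))
    h = len(points) // 2 + 1
    nearest = sorted(points, key=lambda p: math.dist(p, med))[:h]
    return mean_and_cov(nearest)


def flag_outliers(points, center, cov, cutoff):
    """Indices of points whose squared distance exceeds the cutoff."""
    return [i for i, p in enumerate(points)
            if sq_mahalanobis(p, center, cov) > cutoff]


# Toy data (hypothetical): 16 clean points on a grid, then 4 clustered outliers.
grid = (-1.5, -0.5, 0.5, 1.5)
clean = [(x, y) for x in grid for y in grid]
outliers = [(8.0, 8.0), (8.5, 8.0), (8.0, 8.5), (8.5, 8.5)]
data = clean + outliers

# For p = 2 the chi-square quantile has the closed form -2 * ln(1 - q).
cutoff = -2.0 * math.log(1.0 - 0.99)

classical_flags = flag_outliers(data, *mean_and_cov(data), cutoff)
robust_flags = flag_outliers(data, *crude_robust_estimates(data), cutoff)

print(classical_flags)  # [] : the cluster inflates the covariance and masks itself
print(robust_flags)     # [16, 17, 18, 19]
```

In this toy example the four clustered points pull the classical mean toward themselves and stretch the classical covariance along their direction, so none of them exceeds the cutoff. The half-sample estimates are computed only from points near the centre of the data, so the same four points stand out clearly.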
Similar resources
Local multivariate outliers as geochemical anomaly halos indicators, a case study: Hamich area, Southern Khorasan, Iran
Anomaly recognition has always been a prominent subject in preliminary geochemical exploration. Regional geochemical data processing draws on a range of statistical and data mining techniques, as well as different mapping methods that serve to present the outputs. Outlier values are of interest in investigations where data are gathered under controlled condition...
Application of Outlier Robust Nonlinear Mixed Effect Estimation in Examining the Effect of Phenylephrine in Rat Corpus Cavernosum
Ignoring two main characteristics of concentration-response data, correlation between observations and the presence of outliers, may lead to misleading results. Therefore, a special method should be considered. In this paper, outlier robust nonlinear mixed estimation is used to examine the effect of phenylephrine in rat corpus cavernosum. In this study, eight different doses of phenylephrin...
Simultaneous robust estimation of multi-response surfaces in the presence of outliers
A robust approach should be considered when estimating regression coefficients in multi-response problems. Many models are derived from the least squares method. Because the presence of outlier data is unavoidable in most real cases and because the least squares method is sensitive to these types of points, robust regression approaches appear to be a more reliable and suitable method for addres...
Robust tests for testing the parameters of a normal population
This article aims to provide a simple robust method to test the parameters of a normal population by using the new diagnostic tool called the “Forward Search” (FS) method. The most commonly used procedures to test the mean and variance of a normal distribution are Student’s t test and Chi-square test, respectively. These tests suffer from the presence of outliers. We introduce the FS version of...
Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data
Background and purpose: As science, knowledge, and technology evolve, we deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimensional problems, classical methods are not reliable because of a large number of predictor variable...